Although rodents are important reservoirs for RNA viruses, to date only one species of rodent
coronavirus (CoV) has been identified. Herein, we describe a new CoV, denoted Lucheng Rn rat
coronavirus (LRNV), and novel variants of two Betacoronavirus species termed Longquan Aa mouse
coronavirus (LAMV) and Longquan Rl rat coronavirus (LRLV), that were identified in a survey of 1465
rodents sampled in China during 2011-2013. Phylogenetic analysis revealed that LAMV and LRLV fell into
lineage A of the genus Betacoronavirus, which included CoVs discovered in humans and domestic and
wild animals. In contrast, LRNV harbored by Rattus norvegicus formed a distinct lineage within the genus
Alphacoronavirus in the 3CLpro, RdRp, and Hel gene trees, but formed a more divergent lineage in the N
and S gene trees, indicative of a recombinant origin. Additional recombination events were identified in
LRLV. Together, these data suggest that rodents may carry additional unrecognized CoVs.

Keywords: Coronavirus; Rodents
Introduction


Coronaviruses (CoVs; family Coronaviridae) are the etiological
agents of respiratory, enteric, hepatic, and neurological diseases
in animals and humans. The first coronavirus (infectious bronchitis
virus) was isolated in chicken embryos in 1937 (Beaudette and
Hudson, 1937), with subsequent viral isolations in rodents, domestic
animals, and humans. However, until the emergence of severe
acute respiratory syndrome (SARS) in China in 2002/3 (Drosten et
al., 2003; Woo et al., 2009), coronaviruses had been of greater
concern to agriculture than public health. Since the discovery of
SARS-CoV, intense scientific efforts have been directed toward
characterizing additional coronaviruses in humans and other
animals (Drexler et al., 2010; Guan et al., 2003; Lau et al., 2005;
Li et al., 2005; Quan et al., 2010; van der Hoek et al., 2004; Woo
et al., 2012). As a consequence, the number of coronaviruses
identified has increased rapidly (Woo et al., 2009, 2012). Of
particular importance was the recent discovery of a new severe
respiratory illness with renal failure (Middle East Respiratory
Syndrome, MERS) caused by a novel coronavirus (MERS-CoV)
(Bermingham et al., 2012; van Boheemen et al., 2012), which
is also a zoonosis (Annan et al., 2013; Azhar et al., 2014; Reusken
et al., 2013). It is highly likely that there are additional
unrecognized coronaviruses circulating in animals.

Rodentia (rodents) is the largest order of mammals with 
approximately 2277 species worldwide, representing some 42% 
of all mammalian species (Wilson and Reeder, 2005). Rodents are 
a major zoonotic source of human infectious diseases (Meerburg 
et al., 2009; Luis et al., 2013), particularly as they often live at high 
densities and hence may harbor high levels of microbial diversity 
(Moya et al., 2004). In addition, some rodent species live in close 
proximity to humans, such that they represent an important 
zoonotic risk. To date, however, only one species of coronavirus 
- Murine coronavirus - has been associated with rodents 
(de Groot et al., 2011). The prototype virus, which was named 
mouse hepatitis virus (MHV), was first isolated in mice in 1949 
(Cheever et al., 1949), with a variant then identified in rats in 1970
(where it was termed rat sialodacryoadenitis coronavirus) (Parker
et al., 1970). No other rodent-associated CoVs have been discovered
since this time.

Although RNA viruses are often characterized by their high
rates of mutation, recombination may also be of evolutionary
importance, and has been associated with such characteristics as
the ability to infect new hosts and alter virulence (Holmes, 2013).
Recombination appears to be commonplace in coronaviruses
(Graham and Baric, 2010; Jackwood et al., 2012; Keck et al.,
1987; Woo et al., 2006), which may facilitate their emergence.
For example, of the two types of feline CoV (FCoV), type I and
type II, FCoV type II arose by double recombination events between
FCoV type I and canine coronavirus (CCoV) (Herrewegh et al., 1998).
Similarly, recombination generated the three genotypes (A, B, and C)
of human coronavirus HKU1 (Woo et al., 2006), and homologous
recombination has occurred in the evolutionary history of SARS-CoV
(Graham and Baric, 2010).


Table 1
Prevalence of coronaviruses in rodents in Zhejiang Province, China.

Species                  Longquan                 Wencheng                 Lucheng       Total
                         Residential  Field       Residential  Field       Residential

Apodemus agrarius        -            10/427      -            0/17        -             10/444
Mus musculus             0/3          -           0/4          -           -             0/7
Microtus fortis          0/44         0/261       -            -           -             0/305
Micromys minutus         -            0/2         -            -           -             0/2
Niviventer confucianus   -            1/58        -            0/27        -             1/85
Rattus norvegicus        3/214        -           0/31         -           1/17          4/262
R. losea                 -            14/300      -            0/1         -             14/301
R. tanezumi              0/25         1/7         0/18         -           0/3           1/53
R. fulvescens            0/1          0/3         -            -           -             0/4
R. edwardsi              -            -           -            0/2         -             0/2
Total                    3/287        26/1058     0/53         0/47        1/20          30/1465

Note: values are CoV RNA positive specimens/total specimens; "-" indicates that no animals were captured.

To explore the diversity and evolution of CoVs in rodent 
populations we screened rodents collected from rural regions of 
Zhejiang province, China. This revealed a remarkable diversity of 
CoVs circulating in rodents, along with evidence for cross-species 
transmission and recombination. 


Results 
Collection of rodents, and the identification of coronaviruses 


A total of 1465 rodents representing 10 different species were 
captured from three locations in Zhejiang province, China during 
2011-2013 (Table 1 and Fig. 1). RT-PCR targeting a conserved 
sequence of the viral RdRp (RNA-dependent RNA polymerase) 
gene was performed to detect coronaviruses. PCR products of the 
expected size were recovered from 10 Apodemus agrarius, 4 Rattus
norvegicus, 14 R. losea, 1 R. tanezumi, and 1 Niviventer confucianus,
such that approximately 2% of rodents (30/1465) were positive for CoV
(Table 1). The classification of these viruses as CoVs (family
Coronaviridae, genera Alphacoronavirus and Betacoronavirus) was
confirmed by genetic analyses (see below).
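
The overall prevalence can be recomputed from the per-species totals in Table 1. The following is a minimal sketch in Python: the counts are transcribed from the Total column of Table 1, while the Wilson score interval is an addition for illustration only and is not part of the original analysis. The function names are hypothetical.

```python
from math import sqrt

# Positive/total specimens per species, transcribed from the Total column of Table 1.
TABLE1_TOTALS = {
    "Apodemus agrarius": (10, 444),
    "Mus musculus": (0, 7),
    "Microtus fortis": (0, 305),
    "Micromys minutus": (0, 2),
    "Niviventer confucianus": (1, 85),
    "Rattus norvegicus": (4, 262),
    "Rattus losea": (14, 301),
    "Rattus tanezumi": (1, 53),
    "Rattus fulvescens": (0, 4),
    "Rattus edwardsi": (0, 2),
}


def overall_prevalence(counts):
    """Return (positives, total, proportion positive) summed over all species."""
    positives = sum(pos for pos, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return positives, total, positives / total


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%).

    Assumption: this interval is illustrative; the source reports no interval.
    """
    p = positives / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    pos, n, prev = overall_prevalence(TABLE1_TOTALS)
    low, high = wilson_interval(pos, n)
    print(f"{pos}/{n} positive = {100 * prev:.1f}% "
          f"(Wilson interval {100 * low:.1f}%-{100 * high:.1f}%)")
```

Summing the species rows reproduces the 30/1465 reported in the Total row, which is the basis of the "approximately 2%" figure.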


Genetic characterization of viral sequences 


To better characterize the rodent CoVs discovered here, complete
viral RdRp gene sequences were recovered from 21 (70%) of the RNA
positive rodent samples described above. Additionally, 1 complete
and 4 near complete (> 98%) viral genome sequences were successfully
recovered from five positive CoV samples (Table S1). Genetic
analysis indicated that two CoVs sampled from R. norvegicus in
Lucheng and Longquan shared 50.6%-71.7% nucleotide sequence
similarity with alphacoronaviruses; 15 CoVs from 13 R. losea, 1 R.
tanezumi, and 1 N. confucianus from Longquan had 78.4%-89.5%
nucleotide sequence similarity with murine coronavirus and
76.4%-85.7% nucleotide sequence similarity with human coronavirus HKU1;
and 13 CoVs from 10 A. agrarius, 2 R. norvegicus and 1 R. losea from
Longquan had 60.3%-85.9% nucleotide sequence similarity with
rabbit coronavirus HKU14, isolated from domestic rabbits in
Guangzhou, China (Lau et al., 2012) (see phylogenetic results below). Overall,
we designated these newly described viruses as Longquan Aa mouse
coronavirus (LAMV), Longquan Rl rat coronavirus (LRLV), and
Lucheng Rn rat coronavirus (LRNV), reflecting their host species
and the geographic location of sampling.

Fig. 1. A map of Zhejiang province, China showing the location of trap sites in which rodents were captured and surveyed for coronaviruses.

Further comparison of the CoV replicase domains [i.e.
ADP-ribose 1’-phosphatase (ADRP), chymotrypsin-like protease
(3CLpro), RdRp, helicase (Hel), 3’-to-5’ exonuclease (ExoN),
nidoviral endoribonuclease specific for uridylate (NendoU) and
ribose-2’-O-methyltransferase (O-MT)] revealed that LRNV Lucheng-19
was < 90% similar in amino acid sequence to known members of
the genus Alphacoronavirus (Table S2). Hence, these data suggest
that LRNV is sufficiently divergent that it represents a novel
coronavirus species according to the criteria for species
demarcation in the subfamily Coronavirinae defined by the
International Committee on Taxonomy of Viruses (ICTV) (de Groot
et al., 2011). With respect to the other two viruses, LRLV
Longquan-189 was < 90% similar to known members of the genus
Betacoronavirus in the ADRP and NendoU regions, suggesting that
it represents a new variant of murine coronavirus (see
phylogenetic analysis). Although LAMV Longquan-343 was nearly 90%
similar in the conserved replicase domains to betacoronavirus 1
(i.e. Human coronavirus OC43, Bovine coronavirus, Porcine
hemagglutinating encephalomyelitis virus, and Rabbit coronavirus
HKU14) and < 90% to other members of the genus Betacoronavirus
(Table S2), it does not exhibit sufficient sequence divergence to
represent a new virus species. Hence, we classify it as a new
variant of Betacoronavirus 1.
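
The 90% amino acid identity comparison used above can be illustrated with a short Python sketch. Several things here are assumptions for illustration rather than the method actually used: identity is computed over aligned columns in which neither sequence has a gap, the function names are hypothetical, and the example sequences and percentages in the usage lines are toy values, not data from Table S2.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gap-containing columns are skipped; the source does not state
    how identity was calculated.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def below_species_threshold(domain_identities, threshold=90.0):
    """True when every replicase domain is less than `threshold` % identical."""
    return all(value < threshold for value in domain_identities.values())


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for demonstration only).
    print(percent_identity("MKV-LA", "MKVTLG"))
    # Hypothetical per-domain identities against the closest known species.
    print(below_species_threshold({"ADRP": 55.0, "RdRp": 89.9}))
```

A candidate virus would be flagged as sufficiently divergent only when all of the compared replicase domains fall below the threshold against every known member of the genus.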

The viral genome sequences obtained in this study were
compared with Rhinolophus bat coronavirus HKU2, a member of the
genus Alphacoronavirus (Lau et al., 2007), and with MHV
(Cheever et al., 1949), human coronavirus HKU1 (Woo et al., 2006),
and rabbit coronavirus HKU14 (Lau et al., 2012), which are all
members of the genus Betacoronavirus (Fig. 2). A previous description
of the genome organization of CoVs (de Groot et al., 2011) was used as a
reference. LRNV (Lucheng-19) had a genome of 28,763 nucleotides,
with a G+C content of 40.2%. Its genome organization was similar
to those of members of the genus Alphacoronavirus, with the
characteristic 5’-replicase ORF1ab-S-envelope (E)-membrane (M)-N-3’
gene order (Fig. 2, Tables 2 and S3). The replicase ORF1ab (20,249
nucleotides in length) includes 16 predicted nonstructural proteins
(nsp) (Table S3). In a manner similar to Rhinolophus bat coronavirus
HKU2 and human coronavirus NL63, LRNV possessed the
core part of the putative transcription regulatory sequence (TRS)
5’-AACUAA-3’ upstream of the 5’ end of each ORF with the
exception of NS4, with variable nucleotides matching the leader core
sequences (Table 2). Additionally, LRNV contained NS7a and NS7b
genes between the M and N genes (Fig. 2), which are observed
in no other members of the genus Alphacoronavirus. Perhaps the
most striking feature of the LRNV genome was that NS2 encodes a
putative nonstructural protein of 275 amino acids located between
the replicase ORF1ab and the S gene (Fig. 2, Table 2). A BLAST
search revealed that this NS2 had no amino acid sequence
similarity with alpha-CoVs, but possessed approximately 42%
amino acid identity with the NS2a of lineage A of beta-CoVs.
Hence, this is suggestive of a homologous recombination event
between alpha-CoVs and beta-CoVs (which was confirmed in
the phylogenetic analysis below).

Fig. 2. Genome organization of coronaviruses. The three CoVs discovered in this study are shown in bold. The star signifies the presence of a betacoronavirus-like NS2a gene in LRNV.


Table 2
Coding potential and putative transcription regulatory sequences of the LRNV (Lucheng-19) genome sequence.

ORF    Location (nt)                   Length  Length  TRS       Matching bases vs.   TRS sequence(s)
                                       (nt)    (aa)    location  leader TRS           (distance in bases to AUG)
                                                                 (body/leader)

1ab    332-20,580 (shift at 12,538)    20,249  6749    65        -                    CAACUCAACUAAACGA(251)AUG
NS2    20,577-21,404                   828     275     20,565    11/12                CAACUUAACUAAAUG
S      21,412-24,816                   3405    1134    21,398    11/14                UGACUAAACUAAACAUG
NS4    24,813-25,457                   645     214     -         -                    -
E      25,457-25,693                   237     78      25,446    10/12                CCACUUAACUAAUG
M      25,703-26,449                   747     248     25,690    9/13                 UUGAUCAACUAAAAUG
NS7a   26,461-26,958                   498     165     26,443    11/16                GGUCUAAACUAAACCA(2)AUG
NS7b   26,555-26,926                   372     123     26,511    9/16                 CAUUAAAACUAAUUGU(28)AUG
N      26,974-28,149                   1176    391     26,960    8/14                 AGUUUCAACUAACAAUG
NS8a   27,607-28,149                   543     180     27,555    9/16                 UGAUAGAACUAAAGAA(36)AUG
NS9    28,151-28,465                   315     104     28,134    9/16                 UGAUGAAACUAAUUGA(1)AUG

Note: numbers in parentheses represent the number of nucleotides to the putative start codon (AUG). The conserved TRS core sequence is AACUAA. "-" indicates not applicable (leader TRS) or no TRS identified (NS4).


Table 3
Characteristics of the spike protein in LRLV and LAMV.

Spike    Strain         Signal   Receptor binding    Cleavage  Heptad repeat          Transmembrane
protein                 peptide  domain              site                             domain

LRLV     Longquan-708   1-14     325-533; 586-669    754-755   1009-1122; 1257-1288   1307-1329
         Longquan-370   1-13     319-525             762-763   996-1109; 1244-1275    1294-1316
         Longquan-189   1-13     319-525             762-763   996-1109; 1244-1275    1294-1316
LAMV     Longquan-343   1-15     327-497             716-777   1004-1117; 1252-1290   1302-1324
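
The coordinates listed in Table 2 can be cross-checked with simple arithmetic. The Python sketch below is illustrative only. The values are transcribed from Table 2; pairing each TRS entry with the ORF whose start codon it precedes is an assumption made here, as are the hypothetical function names and the convention that a spacer of `None` means the printed sequence already ends in AUG.

```python
TRS_CORE = "AACUAA"

# (ORF, start, end, length in nt, length in aa), transcribed from Table 2.
ORFS = [
    ("1ab", 332, 20580, 20249, 6749),
    ("NS2", 20577, 21404, 828, 275),
    ("S", 21412, 24816, 3405, 1134),
    ("NS4", 24813, 25457, 645, 214),
    ("E", 25457, 25693, 237, 78),
    ("M", 25703, 26449, 747, 248),
    ("NS7a", 26461, 26958, 498, 165),
    ("NS7b", 26555, 26926, 372, 123),
    ("N", 26974, 28149, 1176, 391),
    ("NS8a", 27607, 28149, 543, 180),
    ("NS9", 28151, 28465, 315, 104),
]

# (ORF, TRS start, printed TRS sequence, spacer), transcribed from Table 2.
# spacer is the number printed in parentheses before AUG; None means the
# printed sequence itself ends with the AUG start codon.
# Assumption: the ORF assigned to each TRS is inferred, not stated in the table.
TRS_ENTRIES = [
    ("1ab", 65, "CAACUCAACUAAACGA", 251),
    ("NS2", 20565, "CAACUUAACUAAAUG", None),
    ("S", 21398, "UGACUAAACUAAACAUG", None),
    ("E", 25446, "CCACUUAACUAAUG", None),
    ("M", 25690, "UUGAUCAACUAAAAUG", None),
    ("NS7a", 26443, "GGUCUAAACUAAACCA", 2),
    ("NS7b", 26511, "CAUUAAAACUAAUUGU", 28),
    ("N", 26960, "AGUUUCAACUAACAAUG", None),
    ("NS8a", 27555, "UGAUAGAACUAAAGAA", 36),
    ("NS9", 28134, "UGAUGAAACUAAUUGA", 1),
]


def orf_length_mismatches(orfs):
    """Names of ORFs where end - start + 1 differs from the listed length."""
    return [name for name, start, end, nt, _ in orfs if end - start + 1 != nt]


def predicted_start(trs_start, sequence, spacer):
    """Position of the first base of the AUG implied by a TRS entry."""
    if spacer is None:
        return trs_start + len(sequence) - 3  # last three bases are the AUG
    return trs_start + len(sequence) + spacer


def trs_position_mismatches(orfs, trs_entries):
    """Names of ORFs whose start disagrees with the TRS-implied AUG position."""
    starts = {name: start for name, start, *_ in orfs}
    return [name for name, pos, seq, spacer in trs_entries
            if predicted_start(pos, seq, spacer) != starts[name]]


if __name__ == "__main__":
    print("ORF length mismatches:", orf_length_mismatches(ORFS))
    print("TRS position mismatches:", trs_position_mismatches(ORFS, TRS_ENTRIES))
    print("core present in all:", all(TRS_CORE in s for _, _, s, _ in TRS_ENTRIES))
```

Under this pairing every TRS entry points to the start codon of one ORF, and NS4 is the only ORF without an entry, which agrees with the exception noted in the text.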

In contrast, the genome organizations of LRLV and LAMV were
similar to those of members of the lineage A CoVs of the genus
Betacoronavirus, with the characteristic
5’-ORF1ab-hemagglutinin-esterase (HE)-S-E-M-N-3’ gene order (Fig. 2).
In both LAMV and LRLV the NS5a and NS5b nonstructural proteins were
located between the S and E genes. Interestingly, however, NS5a was not
observed in the strain Longquan-370 (LRLV). The TRS of LAMV and
LRLV appeared in two forms: the CUAAAC and CCAAAC types. The
S proteins of LRLV were 1350-1366 amino acids in length, with
65%-89% amino acid identity to the S proteins of other lineage A
CoVs of the genus Betacoronavirus, while the S protein of LAMV
was 1358 amino acids in length with 62%-68% amino acid
identity to those of other lineage A CoVs of the genus Betacoronavirus.
Finally, the S proteins of both LRLV and LAMV contain a potential
signal peptide, a receptor binding domain, a potential S1/S2
cleavage site, two heptad repeats and one transmembrane domain
(Table 3).


Phylogenetic relationships among the CoVs 


To determine the evolutionary relationships among the novel
CoVs discovered here and those found previously, we inferred
phylogenetic trees based on the amino acid sequences of the
3CLpro, RdRp, Hel, S and N proteins (Figs. 3 and 4). Consistent with
previous work (de Groot et al., 2011), all CoVs fell into two well
supported groups, corresponding to the Alphacoronavirus and
Betacoronavirus genera, respectively. With respect to the viruses
identified here, LRNV clustered within the genus Alphacoronavirus,
while LAMV and LRLV fell into the genus Betacoronavirus.

One of the most striking observations from the phylogenetic
analysis was that the sequences from LRNV occupied different
positions across the five phylogenetic trees, strongly suggestive of
recombination (Figs. 3 and 4). Specifically, in the 3CLpro and RdRp
gene trees, LRNV clustered as a member of the genus
Alphacoronavirus. LRNV also fell with alphacoronaviruses in the Hel gene
tree, but as a basal lineage. Strikingly, however, LRNV clustered
with the betacoronaviruses in the N gene tree, and formed a
divergent lineage in the S gene tree with Rhinolophus bat coronavirus
HKU2, which has previously been shown to be a recombinant (Lau
et al., 2007).

In contrast, LAMV and LRLV consistently clustered within
lineage A of the genus Betacoronavirus, which also contained
human CoV HKU1, MHV, rabbit coronavirus HKU14, and human
CoV OC43. Notably, however, in the 3CLpro, RdRp, and N
gene trees (Fig. 3), two strains (Longquan-189 and Longquan-370)
formed a monophyletic group with human coronavirus HKU1,
which was isolated from a patient with pneumonia in Hong
Kong in 2005 (Woo et al., 2005). Interestingly, Longquan-708 was
closely related to Longquan-370 and Longquan-189 in the 3CLpro,
RdRp, and Hel gene trees, yet clustered with MHV and rat coronavirus
in the S gene tree (Fig. 4), and represented a distinct lineage in the
N gene tree (Fig. 3). Such variable grouping suggests that
Longquan-708 may be a recombinant between two rodent CoV
lineages. Finally, in the 3CLpro, RdRp, and S gene trees, LAMV
(Longquan-343) occupied the most divergent phylogenetic
position in a group of viruses that contained human CoV OC43 (St-Jean
et al., 2004) as well as viruses sampled from domestic and wild
animals (Hasoksuz et al., 2007; Lau et al., 2012; Lim et al., 2013;
Vijgen et al., 2006). However, LAMV comprised a distinct lineage in
the Hel and N gene trees.

To investigate these putative recombination events in more detail,
we undertook additional sequence analyses (Fig. 5 and Fig. S1).
Multiple methods within the RDP program supported statistically
significant recombination events in LRLV strain Longquan-708
(p< 1.022~ “° to p<5.908~ '”). Similarity plots suggested the
presence of three recombination breakpoints at nucleotide positions
19,294, 19,979, and 22,112, which separated the genome into four
regions (Fig. 5A). In turn, these could be grouped into two putative
‘parental regions’: region A (nt 1 to 19,915 and 20,600 to 22,784) and
region B (19,915 to 20,600 and 22,784 to the end of the sequence).
In parental region A, Longquan-708 was most closely related to
Longquan Rl rat coronaviruses, while in parental region B it was
more closely related to MHV. This recombination event was confirmed
by phylogenetic analyses, in which the alternative grouping of
Longquan-708 was supported with high bootstrap values (Fig. 5B
and C).

Fig. 3. Phylogenetic analyses of the amino acid sequences of the 3CLpro, RdRp, Hel, and N genes of Lucheng Rn rat coronavirus (LRNV) Lucheng-19, Longquan Aa mouse
coronavirus (LAMV) Longquan-343, and Longquan Rl rat coronaviruses (LRLV) Longquan-708, Longquan-189, and Longquan-370. Numbers (> 70) above or below branches
indicate percentage bootstrap values. The trees were mid-point rooted for clarity only. The scale bar represents the number of amino acid substitutions per site. The GenBank
accession numbers of the viruses used in this analysis are shown in Table S1.

Fig. 4. Phylogenetic analyses of the amino acid sequences of the S genes of LRNV, LAMV and LRLV. Numbers (> 70) above or below branches indicate percentage bootstrap
values. The trees were mid-point rooted for clarity only. The scale bar represents the number of amino acid substitutions per site. The GenBank accession numbers of other
CoVs are given in Table S1.

In contrast, although readily apparent in the amino acid
phylogenies, the recombination event involving LRNV did not receive
significant statistical support in the RDP analysis, likely because the
latter utilizes nucleotide sequences and these are highly divergent (for
example, the S gene of LRNV differs from those of
alphacoronaviruses by > 78% at the nucleotide sequence level). Similar
suggestions have previously been made with respect to recombination
in Rhinolophus bat coronavirus HKU2 (Lau et al., 2007).


Discussion 


We screened for coronaviruses in 1465 rodents representing 10
different species sampled in three locations in Zhejiang province,
southeastern China. This survey identified a novel and
phylogenetically distinct coronavirus in R. norvegicus - Lucheng Rn rat
coronavirus (LRNV) - which belonged to the genus Alphacoronavirus.
According to the criteria defined by ICTV (de Groot et al., 2011), LRNV
was sufficiently genetically distinct that it should be recognized as a
distinct species within the family Coronaviridae. However, the other
two viruses identified - Longquan Aa mouse coronavirus (LAMV) and
Longquan Rl rat coronavirus (LRLV) - belong to the established
species betacoronavirus 1 and murine coronavirus, respectively.
More generally, the presence of all three viruses indicates that
genetically diverse CoVs co-circulate in rodents in Zhejiang province.

It is notable that rodent-associated CoVs comprise a major
proportion of the known genetic diversity in lineage A CoVs of the
genus Betacoronavirus. This lineage contains viruses that cause
enteric and respiratory diseases in humans (human coronavirus
HKU1 and OC43) as well as in domestic animals (e.g.
hemagglutinating encephalomyelitis in pigs) (St-Jean et al., 2004; Vijgen
et al., 2006; Woo et al., 2005; Zhang et al., 1994). Clearly, the role
played by rodents in the evolution of lineage A CoVs of the genus
Betacoronavirus merits further investigation.

This study also provides the first evidence of CoVs of the genus
Alphacoronavirus in Rattus rats (R. norvegicus), in the form of LRNV.
However, our phylogenetic analysis suggested that this virus had
a recombinant origin, with its N gene sequence more closely
related to those of the genus Betacoronavirus (Fig. 3). Recombination
appears to be commonplace in coronaviruses (Woo et al., 2009),
particularly within closely related viruses such as MHV variants
(Smits et al., 2005). Nevertheless, only a few examples of
inter-genotype recombination, involving CoVs from bats (Hon et al., 2008)
and felines (Herrewegh et al., 1998), have been documented to date.
Hence, the observation that LRNV has a recombinant origin is
significant because it means that recombination can occur between
viruses assigned to different genera.

Fig. 5. Recombination within the genome of LRLV Longquan-708. A sequence similarity plot (A) reveals three recombination break-points shown by black dashed lines, with
their locations indicated at the bottom. The plot shows genome scale similarity comparisons of the Longquan-708 (query) against Longquan Rl rat coronavirus (parental
group 1, red) and Murine hepatitis virus (parental group 2, blue). The background color of parental region A is white, while that of parental region B is gray. Phylogenies of
parental region A (B) and parental region B (C) are shown below the similarity plot. Numbers (> 70) above or below branches indicate percentage bootstrap values. The
GenBank accession numbers of the viruses used in this analysis are shown in Table S1.

Alpha- and beta-CoVs are largely associated with mammals,
whereas gamma- and delta-CoVs are largely harbored by avian
species (Woo et al., 2012). Because much of the genetic diversity of
alpha- and beta-CoVs is associated with infections in bats, it has
been suggested that bats are the main reservoir hosts for both
alpha- and beta-CoVs (Woo et al., 2009). Herein, we discovered
three phylogenetically distinct lineages of rodent-associated CoVs
within a limited geographic area in China, all of which are distinct
from those viruses associated with bats. Consequently, it is clear that
large-scale surveillance is needed to fully understand the role
played by rodents in the evolution and emergence of coronaviruses.


Material and methods 
Ethics statement 
This study was reviewed and approved by the ethics committee
of the National Institute for Communicable Disease Control and
Prevention of the Chinese CDC. All animals were treated strictly
according to the guidelines for the Laboratory Animal Use and Care
from the Chinese CDC and the Rules for the Implementation of
Laboratory Animal Medicine (1998) from the Ministry of Health,
China, under the protocols approved by the National Institute for
Communicable Disease Control and Prevention. All surgery was
performed under ether anesthesia, and all efforts were made to
minimize suffering.


Specimen collection 


Rodents were trapped in cages using cooked food as bait during 
2011-2013 in Zhejiang province, China (Fig. 1) (Mills et al., 1995). 
All animals were initially assigned to a rodent species by
morphological examination, and species identity was then confirmed by
sequence analysis of the mt-cyt b gene (Guo et al., 2013). All
animals were anesthetized with ether before they were sacrificed, 
and every effort was made to minimize suffering. Tissue samples 
of liver, spleen, lung, kidney, and rectum were collected from 
animals for the detection of CoVs. 


CoV detection and full genome sequencing 


Total RNA was extracted from fecal or tissue samples using
TRIzol (Invitrogen, Carlsbad, CA) according to the manufacturer's
instructions. The RNA was eluted with RNase-free water and was
used as the template for reverse transcription-PCR (RT-PCR) and
deep sequencing. CoV RNA was detected by nested RT-PCR which
amplified the RNA-dependent RNA polymerase gene (RdRp) of
CoVs using conserved primers (sequence available on request).
Reverse transcription was undertaken using AMV reverse
transcriptase (Promega, Beijing) according to the manufacturer's
protocol. The cDNA was amplified with the following PCR protocol:
35 cycles of denaturation at 94 °C for 40 s, annealing at 44 °C for
40 s and extension at 72 °C for 40 s, with ddH2O as a negative
control. For CoV positive RNA extractions, paired-end (90 bp)
sequencing was performed on the HiSeq 2000 (Illumina) platform.
The library preparation and sequencing steps were performed by
the BGI Tech Corporation (Shenzhen, China) following a standard
protocol provided by Illumina. The resulting sequencing reads
were then assembled de novo by the Trinity program (Grabherr
et al., 2011) into 152,684 contigs (> 200 bp). BLASTx was
performed to retrieve the CoV full genome sequences from the
assembled contigs. These sequences were further verified using
Sanger sequencing methods with primers designed based on the
deep-sequencing results. To amplify the terminal ends, 3’ and 5’
RACE kits (TaKaRa, Dalian, China) were used.


Nucleotide sequence accession numbers 


The sequences generated in this study have been deposited in 
GenBank and assigned accession numbers KF294379-KF294380, 
KF294358-KF294372, KF294345-KF294357 for those representing 
coronaviruses, and KF294387-KF294416 for the host mt-cyt b 
genes (Table S1). 


Evolutionary analyses 


Because of extensive sequence divergence between the
nucleotide sequences of different coronavirus genera, all phylogenetic
analyses were based on amino acid sequences. Accordingly, amino
acid sequence alignments were performed using the MAFFT
algorithm (Katoh and Standley, 2013). After alignment, gaps and
ambiguously aligned regions were removed with Gblocks (v0.91b)
(Talavera and Castresana, 2007). Phylogenetic analyses were then
performed using the sequences of five CoV proteins: (i) 3CLpro, (ii)
RdRp, (iii) Hel, (iv) spike protein (S), and (v) the nucleocapsid
protein (N). Phylogenetic trees were estimated using the
maximum likelihood (ML) method implemented in PhyML v3.0
(Guindon et al., 2010) with bootstrap support values calculated
from 1000 replicate trees. The best-fit amino acid substitution
models (LG+Γ for 3CLpro, LG+Γ+I for Hel, RdRp, S and N) were
determined using MEGA version 5 (Tamura et al., 2011). The
following data set sizes were used in the final analysis:
3CLpro = 290 amino acids (aa), RdRp = 869 aa, Hel = 581 aa,
S = 429 aa, N = 138 aa.
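
Bootstrap support rests on re-estimating the tree from pseudo-replicate alignments built by resampling alignment columns with replacement. The Python sketch below shows only that resampling step, under stated assumptions: it is not PhyML's implementation, the function name is hypothetical, and the three short sequences are toy data rather than sequences from this study.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap pseudo-replicate by resampling columns with replacement.

    `alignment` maps sequence names to equal-length aligned strings; `rng` is a
    random.Random instance so that results can be made reproducible.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


if __name__ == "__main__":
    toy = {"virus_a": "MKVLA", "virus_b": "MRVLG", "virus_c": "MKILA"}  # toy data
    rng = random.Random(0)
    for _ in range(3):
        print(bootstrap_alignment(toy, rng))
```

In a full analysis a tree would be inferred from each of the 1000 replicates, and the support for a branch would be the percentage of replicate trees that contain it.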

The TMHMM program (version 2.0; www.cbs.dtu.dk/services/
TMHMM/) was used to predict the transmembrane domains, while
the SignalP program (version 4.0; http://www.cbs.dtu.dk/services/
SignalP/) was used to determine signal sequences. Protein family
analysis was performed using PFAM and InterProScan (Apweiler
et al., 2001; Bateman et al., 2002).

Following visual inspection of the amino acid phylogenies,
potential recombination events were identified in complete
genome (nucleotide) sequences using the Recombination Detection
Program v4 (RDP4), employing the RDP, GENECONV, bootscan,
maximum chi square, Chimera, SISCAN, and 3SEQ methods
(Martin et al., 2010) (with default parameters). All analyses were
performed with a Bonferroni corrected P-value cutoff of 0.01.
When putative recombination events were observed by two or
more methods and with significant phylogenetic (topological)
incongruence, the viral sequences were considered as potentially
recombinant. To further characterize these recombination events,
particularly the location of breakpoints, we inferred similarity
plots using Simplot version 3.5.1 (Lole et al., 1999). For each of the
putative recombinant regions, phylogenies were estimated using
the ML method performed with PhyML v3.0 (Guindon et al., 2010)
under the best-fit substitution model determined by jModelTest
(Posada, 2008).
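
The idea behind a similarity plot can be sketched in a few lines of Python. This is a simplified illustration and not Simplot itself: it uses raw percent identity rather than a corrected genetic distance, the window and step sizes are assumed values, the function name is hypothetical, and the sequences in the usage lines are toy strings.

```python
def window_identity(query, reference, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window midpoint, percent identity) pairs. Columns with a
    gap in either sequence are ignored; windows with no comparable columns are
    skipped. Assumption: raw identity stands in for a model-based distance.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        pairs = [(q, r)
                 for q, r in zip(query[start:start + window],
                                 reference[start:start + window])
                 if q != "-" and r != "-"]
        if pairs:
            identity = 100.0 * sum(q == r for q, r in pairs) / len(pairs)
            points.append((start + window // 2, identity))
    return points


if __name__ == "__main__":
    # Toy query that matches parent 1 in its first half and parent 2 in its second.
    query = "A" * 10 + "C" * 10
    parent_1 = "A" * 10 + "G" * 10
    parent_2 = "T" * 10 + "C" * 10
    print(window_identity(query, parent_1, window=10, step=10))
    print(window_identity(query, parent_2, window=10, step=10))
```

A crossover between the two curves, where the query switches from being most similar to one parental group to the other, marks a candidate breakpoint that can then be tested by inferring separate phylogenies for the flanking regions.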